Computationally Intensive and Noisy Tasks: Co-Evolutionary Learning and Temporal Difference Learning on Backgammon
Abstract
The most difficult yet realistic learning tasks are both noisy and computationally intensive. This paper investigates how, for a given solution representation, co-evolutionary learning can achieve the highest ability with the least computation time. Using a population of Backgammon strategies, it examines ways to keep computational costs reasonable. With the same simple architecture that Gerald Tesauro used with Temporal Difference learning to create the Backgammon strategy “Pubeval”, co-evolutionary learning here creates a better player.
Similar resources
Why Co-Evolution beats Temporal Difference learning at Backgammon for a linear architecture, but not a non-linear architecture
The No Free Lunch theorems show that the algorithm must suit the problem. This does not answer the novice’s question: for a given problem, which algorithm to use? This paper compares co-evolutionary learning and temporal difference learning on the game of Backgammon, which (like many real-world tasks) has an element of random uncertainty. Unfortunately, to fully evaluate a single strategy using...
Co-Evolutionary Learning on Noisy Tasks
This paper studies the effect of noise on co-evolutionary learning, using Backgammon as a typical noisy task. It might seem that co-evolutionary learning would be ill-suited to noisy tasks: genetic drift causes convergence to a population of similar individuals, and on noisy tasks it would seem to require many samples (i.e., many evaluations, and long computation time) to discern small differenc...
Coevolution of a Backgammon Player
One of the persistent themes in Artificial Life research is the use of co-evolutionary arms races in the development of specific and complex behaviors. However, other than Sims’s work on artificial robots, most of the work has attacked very simple games such as the Prisoner’s Dilemma or predator and prey. Following Tesauro’s work on TD-Gammon, we used a 4000-parameter feed-forward neural network to devel...
Why did TD-Gammon Work?
Although TD-Gammon is one of the major successes in machine learning, it has not led to similarly impressive breakthroughs in temporal difference learning for other applications or even other games. We were able to replicate some of the success of TD-Gammon, developing a competitive evaluation function on a 4000-parameter feed-forward neural network, without using back-propagation, reinforcement ...
Improving Temporal Difference Learning Performance in Backgammon Variants
Palamedes is an ongoing project for building expert playing bots that can play backgammon variants. As in all successful modern backgammon programs, it is based on neural networks trained using temporal difference learning. This paper improves upon the training method that we used in our previous approach for the two backgammon variants popular in Greece and neighboring countries, Plakoto and F...